OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Robust Optical Recognition of Cursive Pashto Script Using Scale, Rotation and Location Invariant Approach

Internal identifier: 000035 (Main/Exploration); previous: 000034; next: 000036

Authors: Riaz Ahmad [Germany, Pakistan]; Saeeda Naz [Pakistan]; Muhammad Zeshan Afzal [Germany]; Sayed Hassan Amin [Pakistan]; Thomas Breuel [Germany]

Source: PLoS ONE (eISSN 1932-6203), 2015

RBID : PMC:4569441

Abstract

The presence of a large number of unique shapes called ligatures in cursive languages, along with variations due to scaling, orientation and location provides one of the most challenging pattern recognition problems. Recognition of the large number of ligatures is often a complicated task in oriental languages such as Pashto, Urdu, Persian and Arabic. Research on cursive script recognition often ignores the fact that scaling, orientation, location and font variations are common in printed cursive text. Therefore, these variations are not included in image databases and in experimental evaluations. This research uncovers challenges faced by Arabic cursive script recognition in a holistic framework by considering Pashto as a test case, because Pashto language has larger alphabet set than Arabic, Persian and Urdu. A database containing 8000 images of 1000 unique ligatures having scaling, orientation and location variations is introduced. In this article, a feature space based on scale invariant feature transform (SIFT) along with a segmentation framework has been proposed for overcoming the above mentioned challenges. The experimental results show a significantly improved performance of proposed scheme over traditional feature extraction techniques such as principal component analysis (PCA).


Url: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4569441
DOI: 10.1371/journal.pone.0133648
PubMed: 26368566
PubMed Central: 4569441



The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Robust Optical Recognition of Cursive Pashto Script Using Scale, Rotation and Location Invariant Approach</title>
<author>
<name sortKey="Ahmad, Riaz" sort="Ahmad, Riaz" uniqKey="Ahmad R" first="Riaz" last="Ahmad">Riaz Ahmad</name>
<affiliation wicri:level="3">
<nlm:aff id="aff001">
<addr-line>University of Technology, Kaiserslautern, Germany</addr-line>
</nlm:aff>
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>University of Technology, Kaiserslautern</wicri:regionArea>
<placeName>
<region type="land" nuts="2">Rhénanie-Palatinat</region>
<settlement type="city">Kaiserslautern</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="aff002">
<addr-line>Shaheed Benazir Bhutto University, Sheringal, Pakistan</addr-line>
</nlm:aff>
<country xml:lang="fr">Pakistan</country>
<wicri:regionArea>Shaheed Benazir Bhutto University, Sheringal</wicri:regionArea>
<wicri:noRegion>Sheringal</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Naz, Saeeda" sort="Naz, Saeeda" uniqKey="Naz S" first="Saeeda" last="Naz">Saeeda Naz</name>
<affiliation wicri:level="1">
<nlm:aff id="aff003">
<addr-line>Hazara University, Department of IT, Mansehra, Pakistan</addr-line>
</nlm:aff>
<country xml:lang="fr">Pakistan</country>
<wicri:regionArea>Hazara University, Department of IT, Mansehra</wicri:regionArea>
<wicri:noRegion>Mansehra</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="aff004">
<addr-line>GGPGC No.1, Abbottabad, Pakistan</addr-line>
</nlm:aff>
<country xml:lang="fr">Pakistan</country>
<wicri:regionArea>GGPGC No.1, Abbottabad</wicri:regionArea>
<wicri:noRegion>Abbottabad</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Afzal, Muhammad Zeshan" sort="Afzal, Muhammad Zeshan" uniqKey="Afzal M" first="Muhammad Zeshan" last="Afzal">Muhammad Zeshan Afzal</name>
<affiliation wicri:level="3">
<nlm:aff id="aff001">
<addr-line>University of Technology, Kaiserslautern, Germany</addr-line>
</nlm:aff>
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>University of Technology, Kaiserslautern</wicri:regionArea>
<placeName>
<region type="land" nuts="2">Rhénanie-Palatinat</region>
<settlement type="city">Kaiserslautern</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Amin, Sayed Hassan" sort="Amin, Sayed Hassan" uniqKey="Amin S" first="Sayed Hassan" last="Amin">Sayed Hassan Amin</name>
<affiliation wicri:level="1">
<nlm:aff id="aff005">
<addr-line>Genie Technologies (Pvt) Ltd. Lahore, Pakistan</addr-line>
</nlm:aff>
<country xml:lang="fr">Pakistan</country>
<wicri:regionArea>Genie Technologies (Pvt) Ltd. Lahore</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Breuel, Thomas" sort="Breuel, Thomas" uniqKey="Breuel T" first="Thomas" last="Breuel">Thomas Breuel</name>
<affiliation wicri:level="3">
<nlm:aff id="aff001">
<addr-line>University of Technology, Kaiserslautern, Germany</addr-line>
</nlm:aff>
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>University of Technology, Kaiserslautern</wicri:regionArea>
<placeName>
<region type="land" nuts="2">Rhénanie-Palatinat</region>
<settlement type="city">Kaiserslautern</settlement>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">26368566</idno>
<idno type="pmc">4569441</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4569441</idno>
<idno type="RBID">PMC:4569441</idno>
<idno type="doi">10.1371/journal.pone.0133648</idno>
<date when="2015">2015</date>
<idno type="wicri:Area/Pmc/Corpus">000011</idno>
<idno type="wicri:Area/Pmc/Curation">000011</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000015</idno>
<idno type="wicri:Area/Ncbi/Merge">000239</idno>
<idno type="wicri:Area/Ncbi/Curation">000239</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">000239</idno>
<idno type="wicri:Area/Main/Merge">000033</idno>
<idno type="wicri:Area/Main/Curation">000035</idno>
<idno type="wicri:Area/Main/Exploration">000035</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Robust Optical Recognition of Cursive Pashto Script Using Scale, Rotation and Location Invariant Approach</title>
<author>
<name sortKey="Ahmad, Riaz" sort="Ahmad, Riaz" uniqKey="Ahmad R" first="Riaz" last="Ahmad">Riaz Ahmad</name>
<affiliation wicri:level="3">
<nlm:aff id="aff001">
<addr-line>University of Technology, Kaiserslautern, Germany</addr-line>
</nlm:aff>
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>University of Technology, Kaiserslautern</wicri:regionArea>
<placeName>
<region type="land" nuts="2">Rhénanie-Palatinat</region>
<settlement type="city">Kaiserslautern</settlement>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="aff002">
<addr-line>Shaheed Benazir Bhutto University, Sheringal, Pakistan</addr-line>
</nlm:aff>
<country xml:lang="fr">Pakistan</country>
<wicri:regionArea>Shaheed Benazir Bhutto University, Sheringal</wicri:regionArea>
<wicri:noRegion>Sheringal</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Naz, Saeeda" sort="Naz, Saeeda" uniqKey="Naz S" first="Saeeda" last="Naz">Saeeda Naz</name>
<affiliation wicri:level="1">
<nlm:aff id="aff003">
<addr-line>Hazara University, Department of IT, Mansehra, Pakistan</addr-line>
</nlm:aff>
<country xml:lang="fr">Pakistan</country>
<wicri:regionArea>Hazara University, Department of IT, Mansehra</wicri:regionArea>
<wicri:noRegion>Mansehra</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<nlm:aff id="aff004">
<addr-line>GGPGC No.1, Abbottabad, Pakistan</addr-line>
</nlm:aff>
<country xml:lang="fr">Pakistan</country>
<wicri:regionArea>GGPGC No.1, Abbottabad</wicri:regionArea>
<wicri:noRegion>Abbottabad</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Afzal, Muhammad Zeshan" sort="Afzal, Muhammad Zeshan" uniqKey="Afzal M" first="Muhammad Zeshan" last="Afzal">Muhammad Zeshan Afzal</name>
<affiliation wicri:level="3">
<nlm:aff id="aff001">
<addr-line>University of Technology, Kaiserslautern, Germany</addr-line>
</nlm:aff>
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>University of Technology, Kaiserslautern</wicri:regionArea>
<placeName>
<region type="land" nuts="2">Rhénanie-Palatinat</region>
<settlement type="city">Kaiserslautern</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Amin, Sayed Hassan" sort="Amin, Sayed Hassan" uniqKey="Amin S" first="Sayed Hassan" last="Amin">Sayed Hassan Amin</name>
<affiliation wicri:level="1">
<nlm:aff id="aff005">
<addr-line>Genie Technologies (Pvt) Ltd. Lahore, Pakistan</addr-line>
</nlm:aff>
<country xml:lang="fr">Pakistan</country>
<wicri:regionArea>Genie Technologies (Pvt) Ltd. Lahore</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Breuel, Thomas" sort="Breuel, Thomas" uniqKey="Breuel T" first="Thomas" last="Breuel">Thomas Breuel</name>
<affiliation wicri:level="3">
<nlm:aff id="aff001">
<addr-line>University of Technology, Kaiserslautern, Germany</addr-line>
</nlm:aff>
<country xml:lang="fr">Allemagne</country>
<wicri:regionArea>University of Technology, Kaiserslautern</wicri:regionArea>
<placeName>
<region type="land" nuts="2">Rhénanie-Palatinat</region>
<settlement type="city">Kaiserslautern</settlement>
</placeName>
</affiliation>
</author>
</analytic>
<series>
<title level="j">PLoS ONE</title>
<idno type="eISSN">1932-6203</idno>
<imprint>
<date when="2015">2015</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>The presence of a large number of unique shapes called ligatures in cursive languages, along with variations due to scaling, orientation and location provides one of the most challenging pattern recognition problems. Recognition of the large number of ligatures is often a complicated task in oriental languages such as Pashto, Urdu, Persian and Arabic. Research on cursive script recognition often ignores the fact that scaling, orientation, location and font variations are common in printed cursive text. Therefore, these variations are not included in image databases and in experimental evaluations. This research uncovers challenges faced by Arabic cursive script recognition in a holistic framework by considering Pashto as a test case, because Pashto language has larger alphabet set than Arabic, Persian and Urdu. A database containing 8000 images of 1000 unique ligatures having scaling, orientation and location variations is introduced. In this article, a feature space based on scale invariant feature transform (SIFT) along with a segmentation framework has been proposed for overcoming the above mentioned challenges. The experimental results show a significantly improved performance of proposed scheme over traditional feature extraction techniques such as principal component analysis (PCA).</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Naz, S" uniqKey="Naz S">S Naz</name>
</author>
<author>
<name sortKey="Hayat, K" uniqKey="Hayat K">K Hayat</name>
</author>
<author>
<name sortKey="Razzak, Mi" uniqKey="Razzak M">MI Razzak</name>
</author>
<author>
<name sortKey="Anwar, Mw" uniqKey="Anwar M">MW Anwar</name>
</author>
<author>
<name sortKey="Madani, Sa" uniqKey="Madani S">SA Madani</name>
</author>
<author>
<name sortKey="Khan, Su" uniqKey="Khan S">SU Khan</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Naz, S" uniqKey="Naz S">S Naz</name>
</author>
<author>
<name sortKey="Umar, Ai" uniqKey="Umar A">AI Umar</name>
</author>
<author>
<name sortKey="Shirazi, Sh" uniqKey="Shirazi S">SH Shirazi</name>
</author>
<author>
<name sortKey="Ahmed, Sb" uniqKey="Ahmed S">SB Ahmed</name>
</author>
<author>
<name sortKey="Razzak, Mi" uniqKey="Razzak M">MI Razzak</name>
</author>
<author>
<name sortKey="Siddiqi, I" uniqKey="Siddiqi I">I Siddiqi</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lorigo, Lm" uniqKey="Lorigo L">LM Lorigo</name>
</author>
<author>
<name sortKey="Govindaraju, V" uniqKey="Govindaraju V">V Govindaraju</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Margner, V" uniqKey="Margner V">V Märgner</name>
</author>
<author>
<name sortKey="El Abed, H" uniqKey="El Abed H">H El Abed</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Pal, U" uniqKey="Pal U">U Pal</name>
</author>
<author>
<name sortKey="Sarkar, A" uniqKey="Sarkar A">A Sarkar</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Penzl, H" uniqKey="Penzl H">H Penzl</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ji, R" uniqKey="Ji R">R Ji</name>
</author>
<author>
<name sortKey="Duan, Ly" uniqKey="Duan L">LY Duan</name>
</author>
<author>
<name sortKey="Chen, J" uniqKey="Chen J">J Chen</name>
</author>
<author>
<name sortKey="Yao, H" uniqKey="Yao H">H Yao</name>
</author>
<author>
<name sortKey="Yuan, J" uniqKey="Yuan J">J Yuan</name>
</author>
<author>
<name sortKey="Rui, Y" uniqKey="Rui Y">Y Rui</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Lowe, Dg" uniqKey="Lowe D">DG Lowe</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Guan, T" uniqKey="Guan T">T Guan</name>
</author>
<author>
<name sortKey="He, Y" uniqKey="He Y">Y He</name>
</author>
<author>
<name sortKey="Duan, L" uniqKey="Duan L">L Duan</name>
</author>
<author>
<name sortKey="Yang, J" uniqKey="Yang J">J Yang</name>
</author>
<author>
<name sortKey="Gao, J" uniqKey="Gao J">J Gao</name>
</author>
<author>
<name sortKey="Yu, J" uniqKey="Yu J">J Yu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Luo, Y" uniqKey="Luo Y">Y Luo</name>
</author>
<author>
<name sortKey="Guan, T" uniqKey="Guan T">T Guan</name>
</author>
<author>
<name sortKey="Wei, B" uniqKey="Wei B">B Wei</name>
</author>
<author>
<name sortKey="Pan, H" uniqKey="Pan H">H Pan</name>
</author>
<author>
<name sortKey="Yu, J" uniqKey="Yu J">J Yu</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Guan, T" uniqKey="Guan T">T Guan</name>
</author>
<author>
<name sortKey="He, Y" uniqKey="He Y">Y He</name>
</author>
<author>
<name sortKey="Gao, J" uniqKey="Gao J">J Gao</name>
</author>
<author>
<name sortKey="Yang, J" uniqKey="Yang J">J Yang</name>
</author>
<author>
<name sortKey="Yu, J" uniqKey="Yu J">J Yu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wei, B" uniqKey="Wei B">B Wei</name>
</author>
<author>
<name sortKey="Guan, T" uniqKey="Guan T">T Guan</name>
</author>
<author>
<name sortKey="Yu, J" uniqKey="Yu J">J Yu</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ji, R" uniqKey="Ji R">R Ji</name>
</author>
<author>
<name sortKey="Yao, H" uniqKey="Yao H">H Yao</name>
</author>
<author>
<name sortKey="Liu, W" uniqKey="Liu W">W Liu</name>
</author>
<author>
<name sortKey="Sun, X" uniqKey="Sun X">X Sun</name>
</author>
<author>
<name sortKey="Tian, Q" uniqKey="Tian Q">Q Tian</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ji, R" uniqKey="Ji R">R Ji</name>
</author>
<author>
<name sortKey="Duan, Ly" uniqKey="Duan L">LY Duan</name>
</author>
<author>
<name sortKey="Chen, J" uniqKey="Chen J">J Chen</name>
</author>
<author>
<name sortKey="Xie, L" uniqKey="Xie L">L Xie</name>
</author>
<author>
<name sortKey="Yao, H" uniqKey="Yao H">H Yao</name>
</author>
<author>
<name sortKey="Gao, W" uniqKey="Gao W">W Gao</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Graves, A" uniqKey="Graves A">A Graves</name>
</author>
<author>
<name sortKey="Liwicki, M" uniqKey="Liwicki M">M Liwicki</name>
</author>
<author>
<name sortKey="Fernandez, S" uniqKey="Fernandez S">S Fernández</name>
</author>
<author>
<name sortKey="Bertolami, R" uniqKey="Bertolami R">R Bertolami</name>
</author>
<author>
<name sortKey="Bunke, H" uniqKey="Bunke H">H Bunke</name>
</author>
<author>
<name sortKey="Schmidhuber, J" uniqKey="Schmidhuber J">J Schmidhuber</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Graves, A" uniqKey="Graves A">A Graves</name>
</author>
<author>
<name sortKey="Schmidhuber, J" uniqKey="Schmidhuber J">J Schmidhuber</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Yousefi, Mr" uniqKey="Yousefi M">MR Yousefi</name>
</author>
<author>
<name sortKey="Soheili, Mr" uniqKey="Soheili M">MR Soheili</name>
</author>
<author>
<name sortKey="Breuel, Tm" uniqKey="Breuel T">TM Breuel</name>
</author>
<author>
<name sortKey="Stricker, D" uniqKey="Stricker D">D Stricker</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<affiliations>
<list>
<country>
<li>Allemagne</li>
<li>Pakistan</li>
</country>
<region>
<li>Rhénanie-Palatinat</li>
</region>
<settlement>
<li>Kaiserslautern</li>
</settlement>
</list>
<tree>
<country name="Allemagne">
<region name="Rhénanie-Palatinat">
<name sortKey="Ahmad, Riaz" sort="Ahmad, Riaz" uniqKey="Ahmad R" first="Riaz" last="Ahmad">Riaz Ahmad</name>
</region>
<name sortKey="Afzal, Muhammad Zeshan" sort="Afzal, Muhammad Zeshan" uniqKey="Afzal M" first="Muhammad Zeshan" last="Afzal">Muhammad Zeshan Afzal</name>
<name sortKey="Breuel, Thomas" sort="Breuel, Thomas" uniqKey="Breuel T" first="Thomas" last="Breuel">Thomas Breuel</name>
</country>
<country name="Pakistan">
<noRegion>
<name sortKey="Ahmad, Riaz" sort="Ahmad, Riaz" uniqKey="Ahmad R" first="Riaz" last="Ahmad">Riaz Ahmad</name>
</noRegion>
<name sortKey="Amin, Sayed Hassan" sort="Amin, Sayed Hassan" uniqKey="Amin S" first="Sayed Hassan" last="Amin">Sayed Hassan Amin</name>
<name sortKey="Naz, Saeeda" sort="Naz, Saeeda" uniqKey="Naz S" first="Saeeda" last="Naz">Saeeda Naz</name>
<name sortKey="Naz, Saeeda" sort="Naz, Saeeda" uniqKey="Naz S" first="Saeeda" last="Naz">Saeeda Naz</name>
</country>
</tree>
</affiliations>
</record>
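
As a minimal sketch (not part of Dilib), the record above can also be read with Python's standard library. The `wicri:` and `nlm:` prefixes appear in the record without namespace declarations, so a strict XML parser rejects it as-is. The sketch works around this by wrapping the record in a hypothetical `<wrapper>` element that binds both prefixes to placeholder URIs. The URIs, the wrapper element, and the function names are assumptions made for illustration only, and the sample is a shortened excerpt of the record.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs: the record does not declare them, so these are
# assumptions made only so that the parser accepts the prefixes.
PLACEHOLDER_NS = {
    "wicri": "urn:example:wicri",
    "nlm": "urn:example:nlm",
}


def parse_record(record_xml: str) -> ET.Element:
    """Parse a <record> fragment whose wicri:/nlm: prefixes are undeclared."""
    decls = " ".join(f'xmlns:{p}="{u}"' for p, u in PLACEHOLDER_NS.items())
    wrapped = f"<wrapper {decls}>{record_xml}</wrapper>"
    return ET.fromstring(wrapped).find("record")


def identifiers(record: ET.Element) -> dict:
    """Map each <idno type=...> under publicationStmt to its text."""
    return {
        idno.get("type"): idno.text
        for idno in record.iterfind(".//publicationStmt/idno")
    }


def author_names(record: ET.Element) -> list:
    """Return the author names listed in titleStmt, in document order."""
    return [n.text for n in record.iterfind(".//titleStmt/author/name")]


# Shortened excerpt of the record shown above.
SAMPLE = """<record><TEI><teiHeader><fileDesc>
<titleStmt>
<author><name sortKey="Ahmad, Riaz">Riaz Ahmad</name>
<affiliation wicri:level="3"><nlm:aff id="aff001">
<addr-line>University of Technology, Kaiserslautern, Germany</addr-line>
</nlm:aff></affiliation></author>
</titleStmt>
<publicationStmt>
<idno type="pmid">26368566</idno>
<idno type="doi">10.1371/journal.pone.0133648</idno>
</publicationStmt>
</fileDesc></teiHeader></TEI></record>"""

record = parse_record(SAMPLE)
print(identifiers(record))
print(author_names(record))
```

Wrapping the text rather than editing it keeps the record itself unchanged; attributes such as `wicri:level` are then exposed under the placeholder URI rather than under any official Wicri namespace.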

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000035 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000035 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     PMC:4569441
   |texte=   Robust Optical Recognition of Cursive Pashto Script Using Scale, Rotation and Location Invariant Approach
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i   -Sk "pubmed:26368566" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd   \
       | NlmPubMed2Wicri -a OcrV1 

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024